Speech/music discrimination : novel features in time domain

نویسنده

  • Muhammad Saeid Muhammad Alnadabi
چکیده

This research aimed to find novel features that can be used to discriminate between speech and music in the time domain for the purpose of data retrieval. The study used speech and music data that were recorded in standard anechoic chambers and sampled at 44.1 kHz. Two types of new features were found and thoroughly examined: the Ratio of Silent Frames (RSF) feature and the Time Series Events (TSE) set of features. The Receiver Operating Characteristics (ROC) curves were used to assess each one of the proposed features as well as certain relevant features from the literature for the purpose of comparison. The RSF feature introduced up to 8% enhancement when compared to a couple of relevant features from the literature. One of the TSE set of features provided close to 100% speech/music discrimination.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Robust Features for Effective Speech and Music Discrimination

Speech and music discrimination is one of the most important issues for multimedia information retrieval and efficient coding. While many features have been proposed, seldom of which show robustness under noisy condition, especially in telecommunication applications. In this paper two novel features based on real cepstrum are presented to represent essential differences between music and speech...

متن کامل

Audio classification using dominant spatial patterns in time-frequency space

This paper presents a novel audio discrimination algorithm using spatial features in time-frequency (TF) space. Three types of audio signals – speech, music without vocal and music with background vocal are taken into consideration for classification. The audio segment is transformed into TF domain yielding the spatial illustration of energy. Nonnegative matrix factorization (NMF) is applied to...

متن کامل

Speech/music discrimination for multimedia applications

Automatic discrimination of speech and music is an important tool in many multimedia applications. Previous work has focused on using long-term features such as differential parameters, variances, and time-averages of spectral parameters. These classifiers use features estimated over windows of 0.5–5 seconds, and are relatively complex. In this paper, we present our results of combining the lin...

متن کامل

Pitch expertise is not created equal: Cross-domain effects of musicianship and tone language experience on neural and behavioural discrimination of speech and music.

Psychophysiological evidence supports a music-language association, such that experience in one domain can impact processing required in the other domain. We investigated the bidirectionality of this association by measuring event-related potentials (ERPs) in native English-speaking musicians, native tone language (Cantonese) nonmusicians, and native English-speaking nonmusician controls. We te...

متن کامل

A Novel Frequency Domain Linearly Constrained Minimum Variance Filter for Speech Enhancement

A reliable speech enhancement method is important for speech applications as a pre-processing step to improve their overall performance. In this paper, we propose a novel frequency domain method for single channel speech enhancement. Conventional frequency domain methods usually neglect the correlation between neighboring time-frequency components of the signals. In the proposed method, we take...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010